{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyOp1YZkXUo1Nss+Mi/VxTsF",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/sugarforever/LangChain-Tutorials/blob/main/expression-language/LangChain_Expression_Language_Runnable.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# LangChain Expression Language 中最重要的接口 Runnable\n",
        "\n",
        "`LEL` ( `LangChain Expression Language` ) 是一种以声明式方法，轻松地将链或组件组合在一起的机制。今天我们来介绍 `LEL` 中最重要的接口 `Runnable`。"
      ],
      "metadata": {
        "id": "OHpaChk4b5UN"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Runnable\n",
        "\n",
        "`LangChain` 定义了 `Runnable` 接口，绝大多数组件也实现了该接口。`Runnable` 接口定义了如下函数：\n",
        "\n",
        "- stream: 流式输出响应\n",
        "- invoke: 基于单一输入调用链\n",
        "- batch: 基于一组输入调用链\n",
        "- astream\n",
        "- ainvoke\n",
        "- abatch\n",
        "\n",
        "后三个函数为前三个函数的异步版本。更多内容请参考 [Expression Language Interface](https://python.langchain.com/docs/guides/expression_language/interface)。\n",
        "\n",
        "`ChatPromptTemplate` 与 `ChatOpenAI` 类都实现了 `Runnable` 接口。参考如下代码：\n"
      ],
      "metadata": {
        "id": "IAzbVA50cVt1"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!pip install -q -U langchain\n",
        "\n",
        "from langchain.prompts import ChatPromptTemplate, Base\n",
        "from langchain.schema.runnable import Runnable\n",
        "prompt = ChatPromptTemplate.from_template(\"Hi, LEL!\")\n",
        "print(isinstance(prompt, Runnable))"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "MZWl8x3dmjsV",
        "outputId": "0ce4eef3-a06c-4647-f1ab-618440503839"
      },
      "execution_count": 1,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.4/1.4 MB\u001b[0m \u001b[31m7.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m90.0/90.0 kB\u001b[0m \u001b[31m6.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.4/49.4 kB\u001b[0m \u001b[31m2.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hTrue\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "### RunnablePassthrough 在管道中实现输入的传递"
      ],
      "metadata": {
        "id": "So9PZ9SigdjN"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "在通过管道构建LangChain链时，我们可能需要将原始输入变量传递给链式模型的后续步骤。我们可以使用类 `RunnablePassthrough` 来达到输入传递的目的。请参考一下示例："
      ],
      "metadata": {
        "id": "aOmBTf29hT15"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "`RunnablePassthrough` 接受输入，并如实地将输入作为自己的输出，从而达到传递的目的。这里我们实现一个 `Runnable` 来接受一个 `Dict` 类型的输入，并在控制台打印出键为 `name` 的值，以此来测试 `RunnablePassthrough` 的传递效果。\n",
        "\n",
        "在构成的链中，第一个 `RunnablePassthrough` 传递的是 `chain.invoke(\"Alex\")` 中的字符串参数 `Alex`，第二个 `RunnablePassthrough` 传递的是管道第一部分的输出，一个字典 dict:\n",
        "\n",
        "```json\n",
        "{\n",
        "  \"name\": \"Alex\"\n",
        "}\n",
        "```"
      ],
      "metadata": {
        "id": "88VQ51ARhshM"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from langchain.schema.runnable import RunnablePassthrough, RunnableConfig, Input\n",
        "from langchain.load.serializable import Serializable\n",
        "from typing import Optional, Dict\n",
        "\n",
        "class StdOutputRunnable(Serializable, Runnable[Input, Input]):\n",
        "    @property\n",
        "    def lc_serializable(self) -> bool:\n",
        "        return True\n",
        "\n",
        "    def invoke(self, input: Dict, config: Optional[RunnableConfig] = None) -> Input:\n",
        "        print(f\"Hey, I received the name {input['name']}\")\n",
        "        return self._call_with_config(lambda x: x, input, config)\n",
        "\n",
        "chain = {\"name\": RunnablePassthrough()} | RunnablePassthrough() | StdOutputRunnable()\n",
        "\n",
        "chain.invoke(\"Simon\")"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "XsOMu9w5dV0-",
        "outputId": "52269096-4e50-4fcf-a042-9ef78f1d2fd0"
      },
      "execution_count": 46,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Hey, I received the name Simon\n"
          ]
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'name': 'Simon'}"
            ]
          },
          "metadata": {},
          "execution_count": 46
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "### itemgetter 实现输入的部分传递\n",
        "\n",
        "我们可以不需要传递输入的完整数据。当我们想要传递字典中的某一个键值，可以通过 `itemgetter` 函数实现。请参考如下代码："
      ],
      "metadata": {
        "id": "qG3fJp0Si-iX"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from langchain.schema import StrOutputParser\n",
        "from langchain.schema.runnable import RunnablePassthrough, RunnableConfig, Input\n",
        "from langchain.load.serializable import Serializable\n",
        "from operator import itemgetter\n",
        "\n",
        "chain = {\"name\": itemgetter(\"user_name\") } | RunnablePassthrough() | StdOutputRunnable()\n",
        "\n",
        "chain.invoke({\"user_name\": \"Alex\"})"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "bVJPSSOqjckz",
        "outputId": "8bb676ac-1724-4bad-912e-0eb3d9b7b850"
      },
      "execution_count": 47,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Hey, I received the name Alex\n"
          ]
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'name': 'Alex'}"
            ]
          },
          "metadata": {},
          "execution_count": 47
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "### 连接模型的一个完整例子\n",
        "\n",
        "1. 利用 `Retriever` 增加外部数据获取能力\n",
        "2. 管道连接 `OpenAI` 的聊天模型完成问答"
      ],
      "metadata": {
        "id": "mzD5Emcsp9rq"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!pip install -q -U openai chromadb tiktoken"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "r9XU7NXLrERq",
        "outputId": "8564e35b-a7b1-4acb-f9fc-3992153e3473"
      },
      "execution_count": 27,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\u001b[?25l     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/1.7 MB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K     \u001b[91m━━━\u001b[0m\u001b[90m╺\u001b[0m\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.1/1.7 MB\u001b[0m \u001b[31m4.3 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K     \u001b[91m━━━━━━━━━━━━━━━\u001b[0m\u001b[91m╸\u001b[0m\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.7/1.7 MB\u001b[0m \u001b[31m10.0 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K     \u001b[91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[91m╸\u001b[0m\u001b[90m━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.7 MB\u001b[0m \u001b[31m12.6 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.7/1.7 MB\u001b[0m \u001b[31m12.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "import os\n",
        "\n",
        "os.environ['OPENAI_API_KEY'] = '您的有效openai api key'"
      ],
      "metadata": {
        "id": "cpayrV5RrTBT"
      },
      "execution_count": 28,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "from langchain.vectorstores import Chroma\n",
        "from langchain.vectorstores.base import VectorStoreRetriever\n",
        "from langchain.embeddings import OpenAIEmbeddings\n",
        "from langchain.schema.runnable import RunnablePassthrough\n",
        "from langchain.chat_models import ChatOpenAI\n",
        "\n",
        "vectorstore = Chroma.from_texts([\"My name is VerySmallWoods, a software engineer based in Dublin.\"], embedding=OpenAIEmbeddings())\n",
        "retriever = vectorstore.as_retriever()"
      ],
      "metadata": {
        "id": "D4uoeGtVrMhI"
      },
      "execution_count": 48,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "retriever.__class__"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "_LoUPnlErxW2",
        "outputId": "c8ece014-1c08-4a85-a4fe-49df36922653"
      },
      "execution_count": 49,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "langchain.vectorstores.base.VectorStoreRetriever"
            ]
          },
          "metadata": {},
          "execution_count": 49
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "`VectorStoreRetriever` 的 `BaseRetriever` 基类实现了 `invoke` 函数。它接受字符串类型输入，并调用 `get_relevant_documents` 函数查询相关文档。\n",
        "\n",
        "```python\n",
        "class BaseRetriever(Serializable, Runnable[str, List[Document]], ABC):\n",
        "    # ......\n",
        "    def invoke(\n",
        "        self, input: str, config: Optional[RunnableConfig] = None\n",
        "    ) -> List[Document]:\n",
        "        return self.get_relevant_documents(input, **(config or {}))\n",
        "```"
      ],
      "metadata": {
        "id": "i5S_fCX0slX1"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "model = ChatOpenAI()\n",
        "template = \"\"\"Answer the question based only on the following context:\n",
        "{context}\n",
        "\n",
        "Question: {question}\n",
        "\"\"\"\n",
        "prompt = ChatPromptTemplate.from_template(template)\n",
        "\n",
        "chain = ( {\n",
        "    \"context\": retriever,\n",
        "    \"question\": RunnablePassthrough()\n",
        "} | prompt | model | StrOutputParser() )\n",
        "\n",
        "chain.invoke(\"Who am I?\")\n"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 36
        },
        "id": "RqhRpOnBqCua",
        "outputId": "8c65d325-3321-46c0-a590-3997c4538816"
      },
      "execution_count": 50,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "'You are VerySmallWoods, a software engineer based in Dublin.'"
            ],
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            }
          },
          "metadata": {},
          "execution_count": 50
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "chain.invoke(\"Where do I live?\")"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 36
        },
        "id": "HDxSyFCbtXSG",
        "outputId": "1a8be89a-300d-4f13-bda8-e7fc76284aef"
      },
      "execution_count": 51,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "'Based on the given context, you live in Dublin.'"
            ],
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            }
          },
          "metadata": {},
          "execution_count": 51
        }
      ]
    }
  ]
}